Abstract: In this paper the performance limits and design 
principles of rateless codes over fading channels are studied. 
The diversity-multiplexing tradeoff (DMT) is used to analyze 
the system performance for all possible transmission rates. It is 
revealed from the analysis that the design of such rateless codes 
follows the design principle of approximately universal codes 
for parallel multiple-input multiple-output (MIMO) channels, in 
which each sub-channel is a MIMO channel. More specifically, it 
is shown that for a single-input single-output (SISO) channel, the 
previously developed permutation codes of unit length for parallel 
channels having rate LR can be transformed directly into rateless 
codes of length L having multiple rate levels (R, 2R, . . . , LR), to 
achieve the DMT performance limit. 

I. Introduction 
A. Background 

Rateless codes present a class of codes that can be trun- 
cated to a finite number of lengths, each of which has a 
certain likelihood of being decoded to recover the entire 
message. Compared with conventional coding schemes having 
a single rate R, such codes can achieve multiple rate levels 
(R, 2R, . . . , LR), depending on different channel conditions. 
A rateless code is said to be perfect if each part of its codeword 
is capacity achieving. Compared with conventional codes, 
rateless codes offer a potentially higher rate. Several results 
have been obtained on the design of perfect rateless codes over 
erasure channels and additive white Gaussian noise (AWGN) 
channels (see [6] and the references therein). 

Unlike in the fixed channel scenario, non-zero error proba- 
bility always exists in fading channels, when the instantaneous 
channel state information (CSI) is not available at the trans- 
mitter and a codeword spans only one or a small number of 
fading blocks. In this scenario, it is well known that there is 
a fundamental tradeoff between the information rate and error 
probability over fading channels, which can be characterized 
as the diversity-multiplexing tradeoff (DMT) [1]. 

Definition 1 (DMT): Consider a multiple-input multiple-output 
(MIMO) system and a family of codes $C_\eta$ operating at 
average SNR $\eta$ per receive antenna and having rates R. The 
multiplexing gain and diversity order are defined as 

$$ r \triangleq \lim_{\eta \to \infty} \frac{R}{\log_2 \eta} \quad \text{and} \quad d \triangleq -\lim_{\eta \to \infty} \frac{\log_2 P_e(R)}{\log_2 \eta}, \tag{1} $$

where $P_e(R)$ is the average error probability at the 
transmission rate R. 

The DMT is an effective performance measure for implementing 
the rateless coding principles in a fading channel. Two 
main concerns naturally arise: (a) determining the DMT limit 
for rateless coding with finite numbers of blocks in a fading 
environment and discovering how it performs with regard to 
conventional schemes; and (b) determining DMT achieving 
codes that are simple (in the sense of encoding and decoding 
complexity). 

B. Contributions of the Paper 

In this paper, we analyze the DMT performance of rateless 
codes. The results show that, compared with conventional 
coding schemes having multiplexing gain $r_n$, rateless codes 
having multiple rates $(r_n, 2r_n, \ldots, Lr_n)$ offer an effective 
multiplexing gain r of $Lr_n$, given the same diversity gain at 
every rate, when $r_n$ is small. As $r_n$ increases, the performance 
of rateless codes degrades and ultimately becomes the same as 
that of conventional schemes. Also, while increasing L lifts 
the overall system DMT curve, it does not necessarily improve 
the system multiplexing gain for every fixed value of $r_n$. It 
is then revealed that the design of such rateless codes follows 
the principle of parallel channel codes that are approximately 
universal [3] over fading channels. More specifically, it is 
shown that for a single-input single-output (SISO) channel, the 
formerly developed unit length permutation codes for parallel 
channels [3] having rate LR can be transformed directly 
into rateless codes of length L having multiple rate levels 
(R, 2R, . . . , LR), to achieve the DMT performance limit. For 
multiple-input multiple-output (MIMO) channels, the results in 
the paper suggest a type of rateless codes that may be viewed 
as a combination of conventional MIMO space-time codes and 
parallel channel codes, both of which have been designed for 
fading channels. 

C. Related Work 

The performance of rateless coding over fading channels 
has also been considered in [4], in which the throughput and 
error probability are discussed. However, the tradeoff between 
these two was not analyzed explicitly. For example, the results 
in [4] show that increasing the value of L will decrease the 
system error probability in certain scenarios and is therefore 
desirable. In this paper we show that while this observation 
is true, the system throughput, i.e., the multiplexing gain, might 
decrease when L becomes larger for a fixed value of $r_n$. 
Overall, our results reveal that the optimal design of rateless 
codes requires the consideration of both $r_n$ and L. 



Rateless coding may be considered as a type of Hybrid-ARQ 
scheme [2]. The DMT for ARQ has been revealed in 
[2]. However, it will be shown in the paper that this DMT 
curve is incomplete and represents the performance only 
when $r_n < \min(M, N)/L$, in which M and N are the 
numbers of transmit and receive antennas. The complete DMT 
curve for rateless coding, including those parts for higher $r_n$, 
has never been revealed before, and will be shown in this 
paper. In addition to this, the results in this paper also offer 
a relationship between the design parameters (i.e., $r_n$ and L) 
and the effective multiplexing gain r of the system, thus offering 
further insights into system design and operational meaning 
compared to conventional coding schemes. Furthermore, we 
suggest new design solutions for rateless codes. Previous work 
on finite-rate feedback MIMO channels relies on either power 
control or adaptive modulation and coding (e.g., [5]), which 
are not necessary for our scheme. 

The rest of this paper is organized as follows. The system 
model is proposed in Section II. In Section III, the DMT 
performance of rateless codes is studied. In Section IV, design 
of specific rateless codes over fading channels is discussed. 
Finally, concluding remarks are made in Section V. 

II. System Model 

We consider a frequency-flat fading channel with M transmit 
antennas and N receive antennas. We assume that the 
transmitter does not know the instantaneous CSI on its 
corresponding forward channels, while CSI is available at the 
receiver. Each message is encoded into a codeword of L 
blocks. Each block takes T channel uses. We assume that 
the channel remains static for the entire codeword length 
(i.e., L blocks).^1 The system input-output relationship can be 
expressed as 

$$ \mathbf{Y} = \sqrt{\frac{P}{M}}\, \mathbf{H} \mathbf{X} + \mathbf{N}, \tag{2} $$

where $\mathbf{X} \in \mathbb{C}^{M \times TL}$ is the input signal matrix; 
$\mathbf{H} \in \mathbb{C}^{N \times M}$ is the channel transfer matrix whose 
elements are independent and identically distributed (i.i.d.) 
complex Gaussian random variables with zero means and unit 
variances; $\mathbf{N} \in \mathbb{C}^{N \times TL}$ is the AWGN matrix with zero 
mean and covariance matrix I; and $\mathbf{Y} \in \mathbb{C}^{N \times TL}$ is the 
output signal matrix. P is the total transmit power, which also 
corresponds to the average SNR $\eta$ (per receive antenna) at the 
receiver. 

The input signal matrix $\mathbf{X}$ can be written as 

$$ \mathbf{X} = [\, \mathbf{X}_1 \;\cdots\; \mathbf{X}_L \,], \tag{3} $$

where $\mathbf{X}_l \in \mathbb{C}^{M \times T}$ is the codeword matrix being sent 
during the lth block, and its corresponding receiver noise matrix 
is denoted by $\mathbf{N}_l \in \mathbb{C}^{N \times T}$. We impose a power constraint 
on each $\mathbf{X}_l$ so that^2 

$$ E\left[ \frac{1}{T} \|\mathbf{X}_l\|_F^2 \right] \le M, \tag{4} $$

for l = 1, . . . , L. 

^1 Note, however, that the analysis in the paper can be extended 
straightforwardly to a faster fading scenario in which the channel 
varies from block to block during each codeword transmission. 



A. Conventional Schemes 

Assume that the transmitter sends the codeword at a rate R 
bits per channel use. A message of size RT is encoded into a 
codeword $\mathbf{X}_l$ (l = 1, . . . , L) and transmitted in T channel uses. 
An alternative method is to encode a message of size RLT into 
$\mathbf{X}$. Both encoding methods will offer the same performance 
provided that T is sufficiently large. 

B. Rateless Coding 

When rateless coding is applied, we wish to decode a 
message of size RLT with the codeword structure as shown 
in (3). During the transmission, the receiver measures the total 
mutual information I between the transmitter and the receiver 
and compares it with RLT after it receives each codeword 
block $\mathbf{X}_l$. If I < RLT after the lth block, the receiver remains 
silent and waits for the next block. If I > RLT after the lth 
block, it decodes the received codeword $[\, \mathbf{X}_1 \cdots \mathbf{X}_l \,]$ 
and sends one bit of positive feedback to the transmitter. Upon 
receiving the feedback, the transmitter stops transmitting the 
remaining part of the current codeword and starts transmitting 
the next message immediately. 
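
The receiver rule above can be illustrated with a minimal sketch. It assumes the static channel of Section II, so that every block adds T times the same per-channel-use mutual information; the function names and the SISO example values are illustrative and do not come from the paper.

```python
import math

def blocks_needed(info_per_use, R, L, T):
    """Return (l, decodable) for one static channel realization.

    l is the first block index (1..L) after which the accumulated mutual
    information exceeds the message size R*L*T. If this never happens,
    (L, False) is returned: all L blocks were used and the channel is in outage.
    """
    target = R * L * T
    accumulated = 0.0
    for l in range(1, L + 1):
        accumulated += T * info_per_use  # static channel: same gain in every block
        if accumulated > target:
            return l, True               # receiver sends its one-bit feedback here
    return L, False

def realized_rate(l, R, L):
    """Rate in bits per channel use when the message is decoded after block l."""
    return L * R / l

# Hypothetical SISO realization: SNR 100, channel gain |h|^2 = 0.5.
i_b = math.log2(1 + 100.0 * 0.5)
l, ok = blocks_needed(i_b, R=2.0, L=4, T=10)
rate = realized_rate(l, R=2.0, L=4)
```

In this example one block carries about 56.7 bits against a message of 80 bits, so decoding happens after the second block and the realized rate is LR/2.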

Unlike conventional schemes, this process will bring multiple 
rate levels (R, 2R, . . . , LR). For example, if I > RLT 
after the first block is received (i.e., l = 1), the receiver will 
be able to decode the entire message and the rate becomes 
LR. Similar observations can be made for l = 2, . . . , L. Therefore, 
compared with conventional schemes, the corresponding 
transmission rate achieved by using rateless codes is always 
equal or higher. Specifically, we define the multiplexing gain 
for each rate level as $(r_n, 2r_n, \ldots, Lr_n)$ where 

$$ r_n \triangleq \lim_{\eta \to \infty} \frac{R}{\log_2 \eta}. $$

Later we will show through the DMT analysis that rateless 
coding can retain the same diversity gain as conventional 
schemes, but with a much higher multiplexing gain, especially 
when the corresponding $r_n$ is low. 

III. Performance analysis 

Denote by $e_l$ the decoding error when decoding is performed 
after the lth block ($0 < l \le L$) and by $\Pr(e_l, l)$ the 
joint probability that a decoding error occurs and decoding is 
performed after the lth block. The system overall error probability 
can be expressed as 

$$ P_e = \sum_{l=1}^{L} \Pr(e_l, l). $$

^2 Note that this is a stricter constraint than letting 
$E\left[ \frac{1}{TL} \|\mathbf{X}\|_F^2 \right] \le M$, which offers at least the 
same performance. 

Define p(l) ($0 \le l \le L$) to be the probability with 
which I < RLT after the lth block, and note that p(0) = 1. 
Following the steps in Section II.B in [2], the average 
transmission rate for each message in bits per channel use 
is given by 

$$ \bar{R} = \frac{LR}{\sum_{l=0}^{L-1} p(l)}. \tag{5} $$

Note that this $\bar{R}$ describes the average rate with which the 
message is removed from the transmitter, i.e., it quantifies 
how quickly the message is decoded at the receiver. We define 
the effective multiplexing gain of the system as 

$$ r \triangleq \lim_{\eta \to +\infty} \frac{\bar{R}}{\log_2 \eta}. $$
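
As a numerical illustration of the average rate in (5), the sketch below evaluates p(l) in closed form for a static SISO Rayleigh channel. That channel choice is an assumption made only for this example: with $|h|^2$ exponentially distributed with unit mean, I < RLT after l blocks is equivalent to $|h|^2 < (2^{RL/l} - 1)/\eta$. The function names are illustrative.

```python
import math

def siso_rayleigh_p(l, R, L, snr):
    """p(l) = Pr{I < R*L*T after block l} for a static SISO Rayleigh channel.

    Assumes I = l*T*log2(1 + snr*g) with g = |h|^2 ~ Exp(1), so T cancels.
    """
    if l == 0:
        return 1.0  # nothing has been received yet
    threshold = (2.0 ** (R * L / l) - 1.0) / snr
    return 1.0 - math.exp(-threshold)

def average_rate(R, p):
    """Average rate L*R / sum_{l=0}^{L-1} p(l), with p = [p(0), ..., p(L-1)]."""
    L = len(p)
    return L * R / sum(p)

R, L, snr = 1.0, 4, 100.0
p = [siso_rayleigh_p(l, R, L, snr) for l in range(L)]
rate = average_rate(R, p)
```

When every p(l) with l > 0 is small, the denominator is close to one and the average rate approaches LR; when all p(l) equal one, it falls back to R.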

Define f(k) to be the piecewise linear function connecting 
the points (k, (M − k)(N − k)) for integral k = 
0, . . . , min(M, N). Recall that a conventional scheme operating 
at multiplexing gain $r_n$ ($0 \le r_n \le \min(M, N)$) would have 
the diversity gain $f(r_n)$. The following theorem shows the 
performance of rateless coding for $0 \le r_n < +\infty$. 

Theorem 1: Assume a sufficiently large T. For 
rateless codes having multiple multiplexing gain levels 
$(r_n, 2r_n, \ldots, Lr_n)$, the corresponding DMT can be expressed 
as (r, d) where 

$$ r = \frac{L}{l}\, r_n, \qquad d = f\!\left(\frac{l r}{L}\right) = f(r_n), $$

for 

$$ \frac{l-1}{L} \min(M, N) \le r_n < \frac{l}{L} \min(M, N) $$

and l = 1, 2, . . . , L. Finally, d = 0 for $r_n \ge \min(M, N)$. 

Proof: See Appendix A. ■ 
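
The operating points described by Theorem 1 can be tabulated with a short sketch that follows the intervals derived in Appendix A, namely r = L r_n / l and d = f(r_n) on the lth interval of r_n. The function names are illustrative, and returning r = r_n when r_n ≥ min(M, N) is an assumption of this sketch, since the theorem only states d = 0 there.

```python
import math

def dmt_f(r, M, N):
    """Piecewise-linear curve through (k, (M-k)(N-k)), k = 0..min(M, N)."""
    m = min(M, N)
    if r < 0 or r > m:
        raise ValueError("r must lie in [0, min(M, N)]")
    k = min(int(math.floor(r)), m - 1)   # left endpoint of the linear segment
    d0 = (M - k) * (N - k)
    d1 = (M - k - 1) * (N - k - 1)
    return d0 + (r - k) * (d1 - d0)

def rateless_dmt(r_n, L, M, N):
    """Return (r, d) for rateless coding with levels (r_n, 2 r_n, ..., L r_n)."""
    m = min(M, N)
    if r_n >= m:
        return (r_n, 0.0)                # assumption: report r = r_n when d = 0
    l = int(math.floor(r_n * L / m)) + 1 # interval (l-1)m/L <= r_n < l*m/L
    return (L * r_n / l, float(dmt_f(r_n, M, N)))

# M = N = 2, L = 2, r_n = 0.5: first interval, so r = 2 * r_n.
point = rateless_dmt(0.5, 2, 2, 2)
```

For M = N = 2 and L = 2 this gives the pair (1.0, 2.5), while a conventional scheme with the same diversity gain operates at multiplexing gain 0.5.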

Note that for rateless coding to achieve the performance 
in Theorem 1, we do not necessarily require $T \to +\infty$. As 
long as T is large enough such that the error probability 
$\Pr(e_l, l) \,\dot{\le}\, \eta^{-f(r_n)}$ for each l, the DMT in Theorem 1 can be 
achieved. While the minimal T for a general MIMO channel 
when applying rateless coding is unknown to the authors, it 
will be shown later that for SISO channels, T = 1 is sufficient 
to achieve the optimal DMT in Theorem 1. 

Comparing rateless coding with conventional schemes, it 
can be shown that for $0 \le r_n < \min(M, N)/L$, $r = Lr_n$ for 
$d = f(r_n)$. In this scenario rateless coding can improve the 
multiplexing gain up to L times that of conventional schemes, 
given the same diversity gain. Fig. 1 gives an example when 
M = N = 2 and L = 2, and $0 \le r_n < 1$. The operating point 
A in the curve for a conventional scheme for $0 \le r_n < 1$ 
corresponds to point B in the curve for rateless coding. 

Fig. 1. The DMTs for conventional schemes and rateless coding for 
$0 \le r_n < 1$. M = N = 2, L = 2. 

Fig. 2. The DMTs for different schemes for $0 \le r_n < 3$. M = N = 3, 
L = 4. 

An important observation from Theorem 1 is that the system 
performance will not be improved after $r_n$ (almost) reaches 
min(M, N)/L, as the optimal DMT is already achieved by 
using rateless coding. This is mainly due to the fact that 
the first block can no longer support the message size when 
the message rate reaches min(M, N)/L. Thus the system 
multiplexing gain decreases for the same diversity gain, and 
finally offers the same DMT as conventional schemes when 
the first L − 1 blocks all fail to decode the message. Fig. 
2 shows an example when M = N = 3, L = 4. This 
observation also implies that for any fixed value of $r_n$, simply 
increasing the value of L does not necessarily improve the 
system DMT performance. Although the overall system DMT 
will increase when L is larger, the multiplexing gain might 
decrease for certain fixed values of $r_n$. A convenient choice 
for L would be in the region of $L < \min(M, N)/r_n$. However, 
note that the maximal multiplexing gain min(M, N) can be 
achieved only with zero diversity gain, and this happens when 
$r_n = \min(M, N)$ regardless of the value of L. 
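
The effect of L at a fixed $r_n$ can be checked with a small worked example. The numbers below are chosen here for illustration and apply the intervals of Theorem 1 as derived in Appendix A; the interval index l is the one containing $r_n$.

```latex
% Worked example (illustrative values): M = N = 3, r_n = 1, so d = f(1) = (3-1)(3-1) = 4.
\begin{align*}
L = 2:&\quad 0 \le r_n < \tfrac{3}{2}            &&\Rightarrow\ l = 1,\quad r = 2 r_n = 2,\\
L = 3:&\quad 1 \le r_n < 2                        &&\Rightarrow\ l = 2,\quad r = \tfrac{3}{2} r_n = \tfrac{3}{2},\\
L = 4:&\quad \tfrac{3}{4} \le r_n < \tfrac{3}{2}  &&\Rightarrow\ l = 2,\quad r = \tfrac{4}{2} r_n = 2.
\end{align*}
% All three cases keep d = 4, but moving from L = 2 to L = 3 lowers the
% effective multiplexing gain from 2 to 3/2 at this value of r_n.
```

Here only L = 2 satisfies $L < \min(M, N)/r_n = 3$ with the first block still able to carry the message, which matches the convenient choice of L noted above.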

IV. Design of rateless codes 

Note that codewords $\mathbf{X}_l$ ($1 \le l \le L$) in (3) are transmitted 
through different channels that are orthogonal in time. This is 
analogous to transmitting $\mathbf{X}_l$ through different channels that 
are parallel in space. In the (space) parallel channel model, 
elements in $\{\mathbf{X}_l\}$ can be jointly (simultaneously) decoded. 
However, for the channel model considered in this paper, 
which we now call the rateless channel, the decoding process 
needs to follow a certain direction in time, i.e., we start decoding 
from $\mathbf{X}_1$, then $[\mathbf{X}_1 \; \mathbf{X}_2]$ if $\mathbf{X}_1$ is not decoded, etc. This 
comparison implies that while good parallel channel codes 
can be used as the basis for rateless coding, they might need 
modifications in order to offer good performance over the 
rateless channel. 

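Specifically, for the rateless channel expressed in the form 
of (2), we consider the corresponding parallel MIMO channel, 
in which each sub-channel is a MIMO channel, having the 
following input-output relationship: 

$$ \mathbf{Y} = \sqrt{\frac{P}{M}} \begin{pmatrix} \mathbf{H} & & \\ & \ddots & \\ & & \mathbf{H} \end{pmatrix} \begin{pmatrix} \mathbf{X}_1 \\ \vdots \\ \mathbf{X}_L \end{pmatrix} + \begin{pmatrix} \mathbf{N}_1 \\ \vdots \\ \mathbf{N}_L \end{pmatrix}, \tag{6} $$

where $\mathbf{H}$, $\mathbf{X}_l$ and $\mathbf{N}_l$ are the same as those in (2). It is easy 
to see that the DMT for this system is $d = f(r/L)$ for $0 \le r \le 
L \min(M, N)$. Assuming a code that achieves this DMT, when 
we implement its transformation $[\, \mathbf{X}_1 \cdots \mathbf{X}_L \,]$ into the 
rateless channel having multiple rates $(r_n, 2r_n, \ldots, Lr_n)$, it is 
not difficult to show that 

$$ \Pr(e_L, L) \,\dot{\le}\, \eta^{-f(r_n)}. \tag{7} $$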



In order to make the overall $P_e \,\dot{\le}\, \eta^{-f(r_n)}$, we need to ensure 
that $\Pr(e_l, l) \,\dot{\le}\, \eta^{-f(r_n)}$ for $1 \le l \le L - 1$. However, those 
conditions are not essential in order to achieve the optimal 
DMT for the parallel channel shown in (6), which only 
requires the condition (7). Thus stricter code design criteria 
are required for the rateless channel. One example of such a 
criterion is the approximately universal criterion [3]. 

Codes being approximately universal for parallel channels 
ensure that the highest error probability when decoding any 
subset of $\{\mathbf{X}_l\}$ in the set of all non-outage events decays 
exponentially in SNR (i.e., in the form of $e^{-\eta^{\delta}}$ for some 
$\delta > 0$) under any fading distribution, and thus can be ignored 
compared with the outage probability under the same fading 
distribution, when the SNR goes to infinity. Specifically, we 
consider the following parallel MIMO channel which is more 
general than the one in (6): 

$$ \mathbf{Y} = \sqrt{\frac{P}{M}} \begin{pmatrix} \mathbf{H}_1 & & \\ & \ddots & \\ & & \mathbf{H}_L \end{pmatrix} \begin{pmatrix} \mathbf{X}_1 \\ \vdots \\ \mathbf{X}_L \end{pmatrix} + \begin{pmatrix} \mathbf{N}_1 \\ \vdots \\ \mathbf{N}_L \end{pmatrix}, \tag{8} $$

where each channel matrix in $\{\mathbf{H}_l\}$ ($1 \le l \le L$) follows 
an arbitrary distribution. In particular, when the matrices in 
$\{\mathbf{H}_l\}$ are i.i.d. and of the same distribution as the $\mathbf{H}$ in (2), 
following the same steps as those in [1], it is not difficult to 
show that the optimal DMT for this system is $d = L f(r/L)$ 
for $0 \le r \le L \min(M, N)$. Now, we are ready to state 
the following theorem considering the performance of rateless 
codes that are transformed from the approximately universal 
codes for the parallel channel in (8). 



Theorem 2: Suppose a code $[\, \mathbf{X}_1^T \cdots \mathbf{X}_L^T \,]^T$ is 
approximately universal for the parallel channel shown in (8) 
and can achieve the DMT points $(Lr_n, Lf(r_n))$ for $0 \le r_n \le 
\min(M, N)$ when the channel matrices have i.i.d. Rayleigh 
fading. Then, its transformation $[\, \mathbf{X}_1 \cdots \mathbf{X}_L \,]$, when 
applied to the rateless channel shown in (2) aiming at multiple 
multiplexing gains $(r_n, 2r_n, \ldots, Lr_n)$, can achieve the DMT 
shown in Theorem 1. 

Proof: See Appendix B. ■ 

While approximately universal codes for the general parallel 
MIMO channel are unknown to the authors, approximately 
universal codes for parallel SISO channels do exist, and can 
be transformed directly into good rateless codes for SISO 
channels. In the following, we apply permutation codes for 
parallel channels [3] to the rateless channel. 

Permutation codes are a class of codes generated from QAM 
constellations. In the encoding process, a message is mapped 
into different QAM constellation points across all subchannels. 
The constellation over one subchannel is a permutation of 
the points in the constellation over any other subchannel. 
The permutation is optimized such that the minimal codeword 
difference is large enough to satisfy the approximate univer- 
sality criterion. Explicit permutation codes can be constructed 
using universally decodable matrices. We refer the readers to 
[3] and the references therein for details. It has been shown 
that permutation codes achieve the optimal DMT for parallel 
channels and have a particularly simple structure. For example, 
the codewords are of unit length. 

Assume the transmission rates over the rateless channel are 
(R, 2R, . . . , LR) bits per channel use. To implement permutation 
codes, we choose a codebook of size $2^{LR}$ (messages) 
for the parallel channel in (8). Each message is mapped into a 
code $[\, X_1^T \cdots X_L^T \,]^T$, in which each $X_l$ is drawn from a 
$2^{LR}$-point QAM constellation. The message can be fully recovered as 
long as any subset of $\{X_l\}$ can be correctly decoded. Now, 
we transform this code into the form $[\, X_1 \cdots X_L \,]$ for 
the rateless channel. Since $\Pr(e_l, l)$ decays exponentially in 
SNR due to the approximate universality of such codes, the 
overall error probability is always dominated by that upon 
receiving all $X_l$ for infinitely high SNR. More precisely, we 
summarize the above observations as the following corollary. 

Corollary 1: Rateless codes that are transformed from per- 
mutation codes for parallel channels can offer exactly the same 
performance as shown in Theorem 1 over the SISO rateless 
channel. 

Proof: The proof is a direct extension of the proof of 
Theorem 2 and is omitted. ■ 
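
The structure that the construction before Corollary 1 relies on can be illustrated with a toy sketch: every message selects one point of a $2^{LR}$-point QAM constellation on each subchannel, and each subchannel uses a different bijection between messages and points, so the message is identifiable from any single symbol. The multiplicative permutation below is a hypothetical stand-in chosen only because it is easy to verify; it is not the optimized permutation of [3] and is not claimed to be approximately universal. The sketch also assumes LR is even so that the QAM constellation is square.

```python
def qam_points(bits):
    """Square QAM with 2**bits points (bits even), odd-integer coordinates."""
    side = 2 ** (bits // 2)
    levels = [2 * i - (side - 1) for i in range(side)]
    return [complex(a, b) for a in levels for b in levels]

def toy_permutation_code(L, R, multipliers):
    """Map message m to L QAM points; subchannel l uses m -> (a_l * m) mod 2**(L*R).

    Odd multipliers are bijections modulo a power of two (illustrative choice).
    """
    if len(multipliers) != L or any(a % 2 == 0 for a in multipliers):
        raise ValueError("need L odd multipliers")
    size = 2 ** (L * R)
    points = qam_points(L * R)
    return [tuple(points[(a * m) % size] for a in multipliers)
            for m in range(size)]

def min_product_distance(code):
    """Smallest product over subchannels of squared symbol differences."""
    best = float("inf")
    for i in range(len(code)):
        for j in range(i + 1, len(code)):
            prod = 1.0
            for x, y in zip(code[i], code[j]):
                prod *= abs(x - y) ** 2
            best = min(best, prod)
    return best

# L = 2 blocks, R = 2 bits per channel use: 16 messages, 16-QAM per block.
code = toy_permutation_code(2, 2, (1, 5))
distance = min_product_distance(code)
```

Because each per-block map is a bijection, two distinct messages differ in every block, so the product distance is strictly positive. Sent over the rateless channel, the first symbol alone then carries rate LR and both symbols together carry rate LR/2.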



V. Conclusions 

The performance of rateless codes has been studied for 
MIMO fading channels in terms of the DMT. The analysis 
shows that design principles for rateless codes can follow 
those of the approximately universal codes for parallel MIMO 
channels. Specifically, it has been shown that for a SISO 
channel, the formerly developed permutation codes of unit 
length for parallel channels having rate LR can be transformed 
directly into rateless codes of length L having multiple rate 
levels (R, 2R, . . . , LR), to achieve the desired optimal DMT 
performance. 



Appendix 

A. Proof of Theorem 1 

Define $r_1 = Lr_n$. Following the steps in [1], it is easy 
to show that $p(l) \doteq \eta^{-f(r_1/l)}$ for $l \ne 0$. We write the error 
probability as 

$$ P_e = \sum_{l=1}^{L-1} (1 - p(l)) \Pr(e_l) + \Pr(e_L, L). \tag{9} $$

In (9), $\Pr(e_l)$ is the error probability when $l I_b > LTR$, where $I_b$ 
is the mutual information of the channel in each block. Using 
Fano's inequality we can obtain the error probability lower 
bound [1]: 

$$ P_e \ge \Pr(e_L, L) \,\dot{\ge}\, \eta^{-f(r_1/L)}. $$

Since $r \le r_1$, we have $\eta^{-f(r_1/L)} \ge \eta^{-f(r/L)}$, and thus the 
desired performance upper bound is obtained. 

Now we prove the achievability part. Consider $\Pr(e_l)$. 
Following the same argument as in the proof of Theorem 
10.1.1 in [8], we get 

$$ \Pr(e_l) \le 3\epsilon \tag{10} $$

for sufficiently large T. Note that a very similar argument has 
been made in Lemma 1 in [7], although it is claimed there that 
both T and L are required to be sufficiently large in order to 
satisfy (10). Now (9) can be further rewritten as 

$$ P_e \,\dot{\le}\, 3(L-1)\epsilon + \eta^{-f(r_1/L)} + (1 - p(L)) \Pr(e_L). \tag{11} $$



Note that 

$$ \bar{R} \doteq \frac{LR}{1 + \sum_{l=1}^{L-1} \eta^{-f(r_1/l)}} \doteq LR $$

for $0 \le r_1 < \min(M, N)$. Thus $r = r_1$ and diversity gain 
$f(r/L)$ is achievable in the range $0 \le r < \min(M, N)$. Note 
that $r_1 = Lr_n$, and thus we have $d = f(r_n)$ for 

$$ r = r_n L, \qquad 0 \le r_n < \frac{\min(M, N)}{L}. $$



So far we have only considered the scenario in which 
$r_n < \min(M, N)/L$. Now the question to ask is what happens if 
we increase the value of $r_n$ to $\min(M, N)/L$ and beyond. In this 
scenario, $f(r_1) = 0$, and thus $\bar{R} \doteq LR/2$. The message rate r 
is decreased to $r_1/2$ due to the fact that after the first block 
the receiver has no chance of decoding the message correctly 
and it always needs the second block. However, the system 
error probability $P_e$ is not changed. Therefore the message 
rate becomes 

$$ r = r_n \frac{L}{2}, \qquad \frac{\min(M, N)}{L} \le r_n < \frac{2\min(M, N)}{L}, \tag{12} $$

and the system DMT becomes 

$$ d = f\!\left(\frac{2r}{L}\right), \qquad \frac{\min(M, N)}{2} \le r < \min(M, N). \tag{13} $$

Similarly, when r reaches min(M, N) again, i.e., $r_n$ reaches 
$2\min(M, N)/L$, we have $f(r_1/2) = 0$. Thus $\bar{R} \doteq LR/3$ and 

$$ r = r_n \frac{L}{3}, \qquad \frac{2\min(M, N)}{L} \le r_n < \frac{3\min(M, N)}{L}; \tag{14} $$

the system DMT becomes 

$$ d = f\!\left(\frac{3r}{L}\right), \qquad \frac{2\min(M, N)}{3} \le r < \min(M, N). \tag{15} $$

Continuing in this manner until $\bar{R} \doteq R$, we obtain the 
desired result and the proof is completed. 

B. Proof of Theorem 2 

Assume that the system in (6) transmits at a rate $LR = 
r_n L \log_2 \eta$. The probability of any decoding error can be upper 
bounded by [1] 

$$ P_e \le P_O + P_{e|O^c}, $$

where $P_O$ is the outage probability and $P_{e|O^c}$ is the average 
error probability given that the channel is not in outage. 
Approximate universality means that for such codes $P_{e|O^c} \doteq 
e^{-\eta^{\delta}}$ under any fading distribution. For the system in (8), these 
include the fading distributions in which $\mathbf{H}_1 = \cdots = \mathbf{H}_l$ 
follow the same distribution as the $\mathbf{H}$ in (2) and $\mathbf{H}_{l+1} = 
\cdots = \mathbf{H}_L = 0$ for all $1 \le l \le L - 1$. When such codes 
are transformed into the rateless channels shown in (2), it is 
a simple matter to show that 

$$ \Pr(e_l) \le P_{e|O^c} \doteq e^{-\eta^{\delta}} $$

for any $1 \le l \le L$, where $\Pr(e_l)$ is given in (9). Thus 
the system error probability for the rateless channel in (2) 
is always upper bounded by 

$$ P_e \,\dot{\le}\, L e^{-\eta^{\delta}}. $$

The rest of the proof follows that of Theorem 1 and is omitted. 